Document Clustering Approaches using Affinity Propagation

Authors

  • Aditi Chaturvedi
  • Kavita Burse
  • Rachana Mishra
Abstract

Document clustering is an unsupervised approach used extensively to navigate, filter, summarize, and manage large document repositories such as the World Wide Web (WWW). It is the process of segmenting a collection of texts into subgroups whose members are similar in content. The purpose of document clustering is to meet human interests in information searching and understanding. Nowadays paper documents are largely kept in electronic form, for quick access and smaller storage. Retrieving relevant documents from a large database is therefore a major issue. This work studies the key challenges of the clustering problem as it applies to the text domain. It also discusses the key methods used for text clustering and their relative advantages.
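
As a rough illustration of what "similar in content" can mean in practice, the sketch below computes pairwise cosine similarity between simple term-frequency vectors. This is an assumption made for illustration, not a method taken from the paper: the helper names (term_frequencies, cosine_similarity), the whitespace tokenization, and the three sample documents are all hypothetical. A pairwise similarity matrix of this kind is the sort of input a similarity-based clustering method such as affinity propagation works from.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Lowercase the text, split on whitespace, and count each token."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, not data from the paper.
docs = [
    "clustering groups similar documents",
    "document clustering groups similar texts",
    "stock prices fell sharply today",
]

vectors = [term_frequencies(d) for d in docs]

# similarity[i][j] is the content similarity between docs[i] and docs[j].
similarity = [[cosine_similarity(u, v) for v in vectors] for u in vectors]
```

In this toy data the first two documents share three terms and so score higher with each other than either does with the third, which shares none. A real pipeline would likely add stemming, stop-word removal, and TF-IDF weighting before clustering.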

Related articles

Text Document Clustering based on Phrase

Affinity propagation (AP) was recently introduced as an unsupervised learning algorithm for exemplar-based clustering. In this paper, a novel text document clustering algorithm is developed based on the vector space model, phrases, and the affinity propagation clustering algorithm. The proposed algorithm is called Phrase Affinity Clustering (PAC). PAC first finds the phrase by Ukkonen suffix tree con...

Performance Evaluation of Affinity Propagation Approaches on Data Clustering

Classical techniques for clustering, such as k-means clustering, are very sensitive to the initial set of data centers, so they need to be rerun many times in order to obtain an optimal result. A relatively new clustering approach named Affinity Propagation (AP) has been devised to resolve these problems. Although AP seems to be very powerful, it still has several issues that need to be improved. ...

Mixture Modeling by Affinity Propagation

Clustering is a fundamental problem in machine learning and has been approached in many ways. Two general and quite different approaches include iteratively fitting a mixture model (e.g., using EM) and linking together pairs of training cases that have high affinity (e.g., using spectral methods). Pair-wise clustering algorithms need not compute sufficient statistics and avoid poor solutions by...

Beyond Affinity Propagation: Message Passing Algorithms for Clustering

Inmar-Ella Givoni, Doctor of Philosophy, Graduate Department of Computer Science, University of Toronto, 2012. Affinity propagation is an exemplar-based clustering method that takes as input similarities between data points. It outputs a set of data points that best represent the data (exemplars), and assignments of each non-exem...

An Efficient and Fast Density Conscious Subspace Clustering using Affinity Propagation

Subspace clustering is an important task for detecting clusters in subspaces. Density-based approaches treat a high-density region in the subspace as a cluster, but this creates a density divergence problem. The proposed work improves the performance of Density Conscious subspace clustering (DENCOS) by utilizing the Affinity Propagation (AP) algorithm to detect the local densities for a dataset. I...

Publication date: 2014